Method and system for image search

ABSTRACT

A method and system for image search, the method comprising: receiving an indication regarding at least one feature of at least one image from a collection of images; creating an updated search algorithm according to the indication; and providing an updated collection of images by using the updated search algorithm.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 14/025,075, filed on Sep. 12, 2013, published as US Patent Application Publication No. 2014/0012880, which is a Continuation-in-part (CIP) of U.S. patent application Ser. No. 13/389,188, filed on Mar. 12, 2012, published as US Patent Application Publication No. 2012/0158784, which is a National Phase Application of PCT International Application No. PCT/IL2010/000634, International Filing Date Aug. 5, 2010, entitled A Method and System for Image Search, published on Feb. 10, 2011 as International Patent Application Publication No. WO 2011/016039, claiming priority of U.S. Provisional Patent Application No. 61/273,652, filed Aug. 6, 2009, each of which is incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION

With the rapid growth of the internet and of the number of internet users over the past ten years, and the rapid increase in the amount of information available over the internet, a need has developed for special tools for searching data, text, images, sequences of images and sounds. Many search engines are available to users and provide powerful tools for image search. Search engines differ from one another in the strategies they use in attempting to find the images which are most relevant to the user-specified search criteria. For example, one can define the size of an image (any size, extra-large, large, medium, small), the type of image (any type, news, face, clipart, line drawings, photo) and the color (all colors, red, green, black, etc.).

Most of the known image search engines attempt to retrieve relevant documents by filtering, wherein an interface is provided to allow the user to set parameters to arrive at a set of relevant documents.

Some web-based search engines use data mining capabilities. Such capabilities may include clustering of images to groups by similar topics, which enables a search for the “nearest” results or for “similar” images. The clustering procedure may employ a group-average-linkage technique to determine relative affinity between documents. Additionally, clustering procedures may take into account behavior of similar users in the past. These clustering procedures usually use off-line “profile-oriented” or “history-oriented” learning systems. Additionally, some of these systems perform image search based on a corresponding text label associated with each image.

BRIEF DESCRIPTION OF THE DRAWINGS

The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings in which:

FIG. 1 is a flow-chart illustrating a method for image search according to embodiments of the present invention;

FIG. 2 is a flow-chart illustrating a method for creating an updated search algorithm for searching for images which include similar and/or identical features to features indicated by a user, according to embodiments of the present invention;

FIG. 3 is a table illustrating a method for image search according to embodiments of the present invention;

FIG. 4 is a table illustrating a method for image search according to embodiments of the present invention; and

FIG. 5 is a schematic illustration of a system for image search according to embodiments of the present invention.

It will be appreciated that for simplicity and clarity of illustration, elements shown in the figures have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements.

DETAILED DESCRIPTION OF THE INVENTION

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, and components have not been described in detail so as not to obscure the present invention.

The large volume of data available over the internet may cause an undesirable result in many of the known image search engines. Many simple searches may return a large number of images, many of which may not be useful or relevant to what the user is seeking. On the other hand, if a user defines his request in an extremely detailed manner (e.g., including years, country, type of information, etc.), the system may return a relatively small number of found (not necessarily relevant) documents, but some important documents may be omitted.

Another drawback of known search engine systems is that they do not enable feedback from the user about the extent of success (or lack of success) of searches which were performed earlier, or the use of this information for a further, “more thorough” search.

As mentioned above, some web-based search engines use clustering of images to groups by similar topics, which enables a search for the “nearest” results or for “similar” images. These clustering procedures may take into account behavior of similar users in the past, but they do not enable taking into account an on-line, dynamic profile of the actual specific user.

Additionally, the known image search engines classify the images by text labels associated with each image and not based on features within the images.

Embodiments of the present invention may provide a method and system for an image search engine which may overcome at least some of the major limitations of the known image search engines. The method and system according to embodiments of the present invention may be able to apply on-line learning procedures, for example, based on a given user input and/or requests, for improving search and classification results.

The method and system for an image search engine according to embodiments of the present invention may utilize a multi-stage procedure for step-by-step convergence of the search results to a set of the closest search results according to the user's requirements.

Reference is now made to FIG. 1, which is a flow-chart illustrating a method for image search according to embodiments of the present invention. As shown in block 110, the method may include providing a collection of images. In some embodiments of the present invention, the collection of images may be provided in response to an initial search inquiry received by a search system. The initial search inquiry may be entered, for example, by a user. The initial search inquiry may be any kind of text search known in the art. For example, the user may enter any search term or combination of terms in an attempt to define the desired items that should be searched for. Accordingly, the provided collection of images may correspond to a search inquiry entered by a user. The result of the first search inquiry provides a first sub-set of images. The provided collection of images may include a very large number of images, which may be, in some cases, too many to enable reviewing of all of them by the user. Moreover, some or most of the images provided may be non-relevant to the user. Therefore, a refining of the search may be required by the user.

As shown in block 120, the method may include receiving an indication, for example, from a user, regarding at least one image and optionally regarding at least two features of at least one image from the provided sub-set of images. For example, a user may indicate the level of relevancy/suitability of at least one of the images or at least one feature of the images. For example, a user may indicate the level of relevancy/suitability for an image by means of binary indication, for example, yes/no or relevant/irrelevant, or by means of multilevel ranking of relevancy/suitability (for example, very relevant, somewhat relevant, irrelevant) or scoring.

In another example, the user may identify an image feature as a requested/desired feature or as a feature that closely suits the goals of the search. For example, the user may identify a certain item/shape and/or a certain color and/or spectrum of colors appearing in an image. For example, the user may mark an image or a portion of an image as including a desired item and/or shape and/or color/color spectrum for a next stage of the search. For the identification of image features, the user may use graphical identification means, such as various predefined functional markers which may be used on the images. For example, a certain kind of marker may be used for identifying a desired color. Another kind of marker, for example, may be used for identifying an image or a portion of an image which includes a desired spectrum of colors. Another kind of marker, for example, may be used for identifying an image or a portion of an image which includes a desired shape and/or item. Additionally, different predefined markers may be used to indicate that a similar feature is requested or that only an identical feature is requested. Other kinds of markers for various kinds of identifications may be used. The indications may be used for refining the search as described in detail herein below.

As shown in block 130, the method for image search according to embodiments of the present invention may include creating, according to the received indication from a user, an updated search algorithm which may enable search for images which include at least one of similar and identical features compared to the features indicated by the user. For example, search categorization functions may be created and added to the updated search algorithm, which may enable image search and categorization into at least two groups: suitable/non-suitable, based on the content of the images and thus, for example, obtaining an updated collection of images based on the user indication of desired image features. The creation of an updated search algorithm may include creation of the algorithm's control parameters, such as a threshold, for example, to be implemented by the algorithm, to distinguish between suitable and non-suitable images/features, for example, based on a required similarity level indicated by the user. In some embodiments, the user may be able to indicate for different marked features the required similarity level for each of them.
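By way of illustration only, the per-feature similarity levels described above may be sketched as follows. This is a minimal sketch under stated assumptions: the names `is_suitable`, `scores` and `required` are hypothetical, similarity scores are assumed to lie between 0 and 1, and an image is assumed to be suitable only when every marked feature meets its own required level. The specification does not prescribe this particular rule.

```python
from typing import Mapping


def is_suitable(scores: Mapping[str, float], required: Mapping[str, float]) -> bool:
    """Return True when every marked feature meets its required similarity level.

    scores   -- hypothetical similarity score per feature for a candidate image
    required -- hypothetical similarity level the user indicated per marked feature
    A feature missing from `scores` is treated as having similarity 0.0.
    """
    return all(scores.get(name, 0.0) >= level for name, level in required.items())


# The user asks for a near-identical shape but only a loosely similar color.
required_levels = {"shape": 0.9, "color": 0.6}
candidate_a = {"shape": 0.95, "color": 0.70}
candidate_b = {"shape": 0.95, "color": 0.50}

print(is_suitable(candidate_a, required_levels))
print(is_suitable(candidate_b, required_levels))
```

Other combination rules, for example a weighted sum of the per-feature scores compared against a single threshold, may equally be used.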

As shown in block 140, the method for image search according to embodiments of the present invention may include providing an additional batch of images from the large collection of images. The additional batch of images does not include images found in previous batches of images. The additional batch is searched using the updated search algorithm of block 130 to provide a next sub-set of images.

As shown in decision block 150, in case the user is satisfied with the updated sub-set of images included in that next batch, the process may stop here. In case an additional refining of the search is required, the user may further mark images from the new batch, and the method may repeat from block 120 to block 140 until no additional refining of the search is required by the user.
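The flow of blocks 110 to 150 may be sketched, by way of example only, as the loop below. The function names `batches` and `refine_search`, and the callables `get_feedback` and `update`, are hypothetical placeholders introduced for illustration. The sketch assumes that the collection holds unique images, that the initial classifier stands in for the initial text inquiry, and that empty feedback means the user is satisfied.

```python
from typing import Callable, Dict, Hashable, Iterator, List, Sequence

Image = Hashable
Classifier = Callable[[Image], bool]   # True means "suitable"
Feedback = Dict[Image, int]            # +1 relevant, -1 irrelevant


def batches(collection: Sequence[Image], size: int) -> Iterator[List[Image]]:
    """Yield consecutive, non-overlapping batches, so no image repeats (block 140)."""
    for start in range(0, len(collection), size):
        yield list(collection[start:start + size])


def refine_search(
    collection: Sequence[Image],
    size: int,
    classifier: Classifier,
    get_feedback: Callable[[List[Image]], Feedback],
    update: Callable[[Classifier, Feedback], Classifier],
) -> List[Image]:
    """Return the last sub-set shown to the user."""
    subset: List[Image] = []
    for batch in batches(collection, size):
        # Blocks 110/140: search the current batch with the current algorithm.
        subset = [img for img in batch if classifier(img)]
        # Block 120: receive the user's indications on the sub-set.
        feedback = get_feedback(subset)
        # Block 150: empty feedback is taken to mean no further refining is needed.
        if not feedback:
            break
        # Block 130: create the updated search algorithm.
        classifier = update(classifier, feedback)
    return subset
```

In this sketch the loop also ends when the collection is exhausted, a case that the description above leaves open.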

Embodiments of the present invention may allow a user to search for images which include a specific visual feature or a combination of visual features marked at one image or at different images. In some embodiments, the search may be in accordance with a predefined required level of similarity to the initially indicated feature(s).

Reference is now made to FIG. 2, which is a flow-chart illustrating a method for creating an updated search algorithm for searching for images which include similar and/or identical features to features indicated by a user and/or included in images indicated by a user, according to embodiments of the present invention. In some embodiments of the present invention, as shown in block 210, the received indication regarding images and/or image feature(s) may be transformed, for example, translated and/or coded into representative mathematical parameters and/or values, for example, by image processing methods and/or tools. In case the received indications are about binary or multilevel relevancy/suitability of the indicated image(s)/portion(s), the transformation may first include identification of features in the indicated images/portions of images, for example by the image processing tools/methods, such as a shape, background and/or colors. Then, the identified features may be transformed into representative mathematical parameters and/or values. In case the received indications are about specific features of the indicated image(s)/portion(s), the specific features may be identified and then transformed into representative mathematical parameters and/or values.

Below is an example of an algorithm implementation, for illustration of the above. Consider data points, received after the current stage of performing a regular image search, of the form {(X[1], y[1]), (X[2], y[2]), . . . , (X[n], y[n])}, where each y[i] is either 1 or −1. This label denotes the class to which the point X[i] belongs: label 1 means that the document belongs to the class “suitable” for the current feature. Each X[i] is an n-dimensional vector of SIFT descriptor values. This set may be considered as training data, which denotes the correct classification which the algorithm is eventually required to distinguish. The training serves to solve the Find Similar (Relevance Feedback) Algorithm, namely to calculate the weights a[i,j] of single descriptors. Given two images and their corresponding SIFT descriptor vectors, we define the similarity between two images simply as the number of interest points shared between the two images. The interest points are defined as “shared” when a pair of interest points (one from each image) has an L2 distance below a certain threshold. Geometric post-processing steps are also used to remove the outlier matches. For each non-marked document X, a value y(X) is calculated by y(X) = Σ a[i,j]*y[i]*(X, X[i]) + w. If y(X) >= 0, the non-marked image X is recognized as “Suitable”, otherwise as “Non-Suitable”. It should be noted that the above description is only a particular example of an image recognition technique that may be employed.
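The decision rule of the above example may be sketched as follows. This is an illustrative sketch only, under several assumptions that are not taken from the specification: the term (X, X[i]) is interpreted as the shared-interest-point similarity defined above, a single weight per marked image is used in place of the per-descriptor weights a[i,j], each interest point is matched at most once, the geometric post-processing step is omitted, and short toy vectors stand in for real SIFT descriptors. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]
ImageDescriptors = List[Descriptor]


def l2(a: Descriptor, b: Descriptor) -> float:
    """Euclidean (L2) distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shared_points(img_a: ImageDescriptors, img_b: ImageDescriptors, max_dist: float) -> int:
    """Count interest points of img_a whose nearest unused point in img_b is closer than max_dist."""
    used = set()
    count = 0
    for d in img_a:
        best_j, best = None, max_dist
        for j, e in enumerate(img_b):
            if j in used:
                continue
            dist = l2(d, e)
            if dist < best:  # strictly below the threshold
                best_j, best = j, dist
        if best_j is not None:
            used.add(best_j)
            count += 1
    return count


def decision(x: ImageDescriptors,
             training: List[Tuple[ImageDescriptors, int]],
             weights: List[float],
             bias: float,
             max_dist: float) -> float:
    """y(X) = sum_i a[i] * y[i] * similarity(X, X[i]) + w, with one weight per marked image."""
    return sum(a * y * shared_points(x, xi, max_dist)
               for a, (xi, y) in zip(weights, training)) + bias


def classify(x, training, weights, bias, max_dist) -> str:
    """Apply the sign test of the example: y(X) >= 0 means "Suitable"."""
    return "Suitable" if decision(x, training, weights, bias, max_dist) >= 0 else "Non-Suitable"


# Toy data: one image marked suitable (+1) and one marked non-suitable (-1).
marked = [([[0.0, 0.0], [1.0, 1.0]], 1), ([[5.0, 5.0], [6.0, 6.0]], -1)]
print(classify([[0.1, 0.0], [1.0, 0.9]], marked, [1.0, 1.0], 0.0, 0.5))
```

In practice the weights and the bias w would be produced by the training step, for example by one of the computational learning tools listed further below; here they are simply fixed by hand.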

The image processing tools/methods may include at least one of the following tools/methods: image pixel vectors categorization, Gabor filter, Fourier Descriptor, Wavelet transform, Scale-Invariant Feature Transform (SIFT), Speeded Up Robust Features (SURF), and/or any other suitable tool/method.

Based on the resulting mathematical parameters and/or values, which represent the indications regarding images and/or specific image features identified by a user as described above, as shown in block 220, search categorization functions may be created, which may be added to an updated search algorithm. Based on the created categorization functions, the updated search algorithm may search for images which include similar and/or identical features to the features indicated by the user and/or included in images indicated by a user. In order to create the search categorization functions, computational learning tools/methods may be utilized to, for example, formulate general rules based on the user identification of desired image features and/or of images including desired features, translated into mathematical parameters and/or values. Additionally, the computational learning tools/methods may be utilized to, for example, formulate general rules based on the indicated relevancy/suitability level of indicated image(s)/portion(s), or of indicated features, if applicable. The formulated rules may be employed in the search categorization functions. The computational learning tools/methods may include at least one of the following tools/methods: Support Vector Machine (SVM), Least Squares SVM (LS-SVM), one-class SVM, relevance feedback algorithms, logistic regression algorithms, neural networks, decision trees, Bayesian networks, and/or any other suitable tool/method. Additionally, the search categorization functions may include a threshold to distinguish between suitable and non-suitable images/features, for example, based on a required similarity level indicated by the user. The threshold may also be determined by the computational learning tools/methods mentioned above.

As mentioned above, as shown in block 230, the created search categorization functions may be added to an updated search algorithm, which may enable image search based on the content of the images and thus, for example, obtaining an updated collection of images based on the user indication of desired images/features. Then, an updated collection of images may be provided by using the updated search algorithm.

In one exemplary embodiment, a user may mark an image, a portion of an image or a spot in the image with a marker which identifies the image, the portion or the spot as including a color/color spectrum suitable to the user's requirements. Different markers may be used for marking the whole image, a portion of the image or a spot in the image. As described above with reference to block 210, the indicated color spectrum may be transformed into representative mathematical parameters and/or values, for example, by an image processing tool. For example, the image processing tool may identify the color spectrum included in the marked image or portion of an image. Then, the same or another image processing tool may translate and/or code the identified color spectrum into representative mathematical parameters and/or values. Based on the resulting mathematical parameters and/or values, as shown in block 220, search categorization functions may be created, which may be added to an updated search algorithm as shown in block 230. Based on the created categorization functions, the updated search algorithm may search for images which include a similar and/or identical color spectrum to the color spectrum indicated by the user.
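One possible way to code a marked color spectrum into mathematical values, given here only as an assumed illustration, is a normalized color histogram compared by histogram intersection. The specification does not name this technique; the functions `color_histogram`, `histogram_intersection` and `has_similar_spectrum`, the bin count and the representation of pixels as (R, G, B) tuples with values 0 to 255 are all assumptions of the sketch.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0..255


def color_histogram(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Code the colors of a marked region as a normalized histogram with bins**3 cells."""
    if not pixels:
        raise ValueError("the marked region contains no pixels")
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin index 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = float(len(pixels))
    return [h / total for h in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in the range 0..1 for two normalized histograms of equal length."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def has_similar_spectrum(marked: Sequence[Pixel], candidate: Sequence[Pixel],
                         threshold: float, bins: int = 4) -> bool:
    """Compare a candidate region with the marked region against a required similarity level."""
    return histogram_intersection(color_histogram(marked, bins),
                                  color_histogram(candidate, bins)) >= threshold


red = [(255, 0, 0)] * 4
dark_red = [(200, 10, 10)] * 4
blue = [(0, 0, 255)] * 4
print(has_similar_spectrum(red, dark_red, 0.8))
print(has_similar_spectrum(red, blue, 0.8))
```

A coarser or finer bin count changes how strictly "similar" is interpreted, which is one way the required similarity level indicated by the user could be reflected.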

In another exemplary embodiment, a user may mark an image or a portion of an image with a marker which identifies the image or the portion as including an item(s)/shape(s)/shape edge(s) which is/are suitable to the user's requirements. As described above with reference to block 210, the indicated item/shape may be transformed into representative mathematical parameters and/or values, for example, by an image processing tool. For example, the image processing tool may identify a shape/item, for example by detecting edges of a shape included in the marked image or portion of an image. Then, the same or another image processing tool may translate and/or code the identified shape into representative mathematical parameters and/or values. Based on the resulting mathematical parameters and/or values, as shown in block 220, search categorization functions may be created, which may be added to an updated search algorithm as shown in block 230. Based on the created categorization functions, the updated search algorithm may search for images which include similar and/or identical item(s)/shape(s)/shape edge(s) to the item(s)/shape(s)/shape edge(s) indicated by the user.

In some embodiments, the user may mark one or several spots on the image with a marker for identifying the spots or points and their relative location in the image. Then, the identification of the spots or points and their relative locations may be translated and/or coded into representative mathematical parameters and/or values, for example, by an image processing tool as shown in block 210. Based on the resulting mathematical parameters and/or values, as shown in block 220, search categorization functions may be created, which may be added to an updated search algorithm as shown in block 230. Based on the created categorization functions, the updated search algorithm may search for images which include similar and/or identical spots in the identified relative locations as identified by the user.
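As a purely illustrative sketch, the relative locations of marked spots may be coded so that the result does not depend on where in the image the spots lie or on the image scale. The functions `normalize_spots` and `layout_distance` are hypothetical, and the sketch assumes that the spots of two images are given in corresponding order; the specification does not prescribe this coding.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def normalize_spots(spots: Sequence[Point]) -> List[Point]:
    """Shift the spots to their centroid and scale by their mean distance from it."""
    if not spots:
        raise ValueError("at least one spot is required")
    n = len(spots)
    cx = sum(x for x, _ in spots) / n
    cy = sum(y for _, y in spots) / n
    centered = [(x - cx, y - cy) for x, y in spots]
    scale = sum(math.hypot(x, y) for x, y in centered) / n
    if scale == 0.0:  # a single spot, or all spots coincide
        return centered
    return [(x / scale, y / scale) for x, y in centered]


def layout_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Largest distance between corresponding spots after normalization (0.0 means identical layout)."""
    if len(a) != len(b):
        raise ValueError("both layouts must contain the same number of spots")
    na, nb = normalize_spots(a), normalize_spots(b)
    return max(math.hypot(xa - xb, ya - yb) for (xa, ya), (xb, yb) in zip(na, nb))


marked = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
moved_and_scaled = [(10.0, 10.0), (16.0, 10.0), (10.0, 16.0)]
collinear = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
print(layout_distance(marked, moved_and_scaled) < 1e-9)
print(layout_distance(marked, collinear) > 0.1)
```

A threshold on `layout_distance` could then play the role of the required similarity level for spot layouts.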

In some embodiments of the present invention, user indications of several different images may be used in order to search for images which include a combination of features included in the indicated images. Reference is now made to FIG. 3, which is a table 300 illustrating a method for image search according to embodiments of the present invention. Columns 310 and 330 show three stages of the methods. Images 50 and 60 shown in column 310 may be included in a larger collection of images not fully shown in table 300 and may be indicated by a user as relevant or as having an extent of relevancy to the user requirements/needs. As shown in column 330, for example, by the methods described in detail above with reference to FIGS. 1 and 2, image 70 may be retrieved, which may include a combination of features included in the indicated images 50 and 60.

In some embodiments of the present invention, user indications of different features, for example, in several different images, may be combined in order to search for images which include the combination of the indicated features. Reference is now made to FIG. 4, which is a table 300a illustrating a method for image search according to embodiments of the present invention. Columns 310a, 320a and 330a show three stages of the methods. Images 50 and 60 shown in column 310a may constitute a collection of images or may be included in a larger collection of images not fully shown in table 300a. As shown in column 320a, a user may indicate by markers 92a-92g shape edges in image 50 which may define a requested shape in image 50, for example, a shape of a flag. By markers 92h-92j, the user may indicate shape edges in image 60, which may define a requested shape in image 60, for example of a maple leaf. By a marker 94, the user may indicate a requested color, for example red color. As shown in column 330a, for example, by the methods described in detail above with reference to FIGS. 1 and 2, image 70 may be retrieved, which may include the identified requested shapes from images 50 and 60 and the identified requested color from image 60.

Reference is now made to FIG. 5, which is a schematic illustration of a system 400 for image search according to embodiments of the present invention. The methods described in detail above may be executed by system 400. System 400 may include a user interface 410, a processor 420 and a non-transitory processor-readable storage medium 430, which may store instructions for processor 420. Processor 420 may receive, for example, from user interface 410, an indication regarding at least one image or at least one feature of at least one image from a collection of images. Further to instructions which may be read from non-transitory processor-readable storage medium 430, processor 420 may create an updated search algorithm according to said indication, as described in detail above with reference to FIGS. 1-3. For example, image processing tools/methods and computational learning tools/methods may be used by processor 420 as described in detail above with reference to FIGS. 1-3, for example, further to instructions which may be read from non-transitory processor-readable storage medium 430. By the updated search algorithm, processor 420 may provide to the user an updated collection of images, for example, further to instructions which may be read from non-transitory processor-readable storage medium 430.

While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those of ordinary skill in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.

1. A method for image search comprising: (a) providing an initial large collection of images; (b) providing a first sub-set of images from said large collection; (c) creating a search algorithm; (d) receiving from a user an indication regarding level of relevancy of at least two features of at least one image from the sub-set of images, and a similarity level for each of the at least two features; (e) calculating updated values of control parameters of the search algorithm according to said indication, wherein the control parameters comprise a threshold to distinguish between suitable and non-suitable images and features of images and weights of descriptors; using the search algorithm for: (f) providing a next batch of images from said large collection which does not include images included in previous batches of images; and (g) providing a next sub-set of images by searching said next batch of images using said search algorithm, wherein the search algorithm is configured to distinguish between suitable and non-suitable images and features of images by implementing the updated control parameters; and (h) repeating items (d)-(g) until additional refining of the image search is not required by the user.
2. A method according to claim 1, wherein calculating updated values of control parameters of said search algorithm comprises: transforming the indication regarding at least two features into representative mathematical parameters; creating search categorization functions based on the mathematical parameters; and adding the created search categorization functions to an updated search algorithm.
3. A method according to claim 1, wherein the threshold is created based on the similarity level indicated by the user.
4. A method according to claim 1, wherein receiving an indication regarding at least two features of at least one image comprises receiving an indication regarding the level of one or more from a list comprising relevancy and suitability of at least one of image or at least two features of at least one image.
5. A method according to claim 4, wherein the indication regarding the level of at least one from the list comprising relevancy and suitability of at least one image or at least two features of at least one image is by means of binary indication.
6. A method according to claim 5, wherein the indication regarding the level of relevancy and suitability of at least one image or at least two features of at least one image is by means of multilevel ranking.
7. A method according to claim 1, wherein each of the at least two features is selected from: an item appearing in the image, a shape appearing in the image, a color appearing in the image and a spectrum of colors appearing in the image.
8. A non-transitory processor-readable storage medium having instructions stored thereon that, when executed by a processor, cause the processor to perform the steps of: (a) providing an initial large collection of images; (b) providing a first sub-set of images from said large collection; (c) creating a search algorithm; (d) receiving an indication regarding level of relevancy of at least two features of at least one image from the sub-set of images, and a similarity level for each of the at least two features; (e) updating the search algorithm according to said indication, by calculating updated values of control parameters, wherein the control parameters comprise a threshold to distinguish between suitable and non-suitable images and features of images and weights of descriptors; using the search algorithm for: (f) providing a next batch of images from said large collection which does not include images included in previous batches of images; and (g) providing a next sub-set of images by searching said next batch using the updated search algorithm, wherein the updated search algorithm is configured to distinguish between suitable and non-suitable images and features of images by implementing the updated control parameters; and (h) repeating items (d)-(g) until additional refining of the image search is not required by the user.
9. The non-transitory processor-readable storage medium according to claim 8, wherein said instructions stored thereon cause the processor to perform the further steps of: transforming the indication regarding at least two features into representative mathematical parameters; creating search categorization functions based on the mathematical parameters; and adding the created search categorization functions to an updated search algorithm.
10. The non-transitory processor-readable storage medium according to claim 8, wherein said instructions stored thereon cause the processor to perform the further steps of: receiving an indication regarding the level of relevancy and suitability of at least one of image or at least two features of at least one image.
11. The non-transitory processor-readable storage medium according to claim 8, wherein each of the at least two features is selected from: an item appearing in the image, a shape appearing in the image, a color appearing in the image and a spectrum of colors appearing in the image.